Watchdog: reboot Raspberry Pi if network is down

Network down! Help!

I have an issue with one of my Raspberry Pis. It does not recover if the Wi-Fi network is down for an extended period of time.

I have not been able to figure out the root cause, or how long the network needs to be unreachable before the Raspberry Pi drops the network connection and never connects again. This happens occasionally, since the Pi is located just a little too far away from my Wi-Fi access point. Because of the building’s fire prevention regulations I do not have a cable route available, so wireless is the way to go.

Now this wouldn’t be such a big issue, but the Raspberry Pi is a headless one (i.e. it does not have a keyboard or monitor), and its only connection is the Wi-Fi network. I have some sensors connected to this Pi, which in turn update a web page. Losing network connectivity leaves the web page data out of date, which is quite inconvenient.

Today after work I noticed that the Pi had lost network connectivity (again) at 04:40. I had to make my way to the garage (again) and reboot the device (which has no keyboard or network connectivity) – and my frustration level rose to “I’ll fix this now!” level.

Right. After a few half-assed attempts at ping + reboot scripts I figured there must be a more elegant solution, and I was sure I was not the first one with this problem.

Enter Watchdog.

Searching the Internet I found a couple of examples, none of which was immediately copy/paste ready for my needs.

They did lead me onto the right track, and after reading some man pages, forums and gist.github.com code snippets (and getting my Pi into a constant reboot loop along the way), I finally came up with what seems to be a working solution.

The solution

First things first: I used a Raspberry Pi 2 with a USB Wi-Fi module and Raspbian Wheezy (4.1.19-v7+ #858 SMP Tue Mar 15 15:56:00 GMT 2016 armv7l GNU/Linux).

Start by installing watchdog:

sudo apt-get install watchdog

Make a backup copy of /etc/watchdog.conf file just in case:

sudo cp /etc/watchdog.conf /etc/watchdog.conf.backup

Edit the /etc/watchdog.conf file to contain the following. There is a short comment on each line about what it does:

$ sudo nano /etc/watchdog.conf
# Watchdog ping: if unresponsive, reboot:
interface = wlan0    # check that wlan0 keeps receiving traffic
ping-count = 5       # ping 5 times
ping = 192.168.1.1   # ping test destination IP address
# Change default interval from 1 second to 20:
interval = 20        # perform watchdog checks every 20 seconds

Then reboot (e.g. sudo reboot).

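The above will ping the destination address 192.168.1.1 five (5) times every 20 seconds. I’m not sure if the interface option is actually needed: according to the watchdog.conf man page it is a separate check that the named interface keeps receiving traffic, not a way to choose the interface used for ping. It did not do any harm, so I left it there.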
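
Before rebooting with the new configuration, it may be worth confirming by hand that the ping target answers at all, since an unreachable target is one way to end up in a reboot loop. The following is only a rough sketch and not part of watchdog: the function name target_reachable is made up for illustration, and it assumes the iputils ping (with its -I, -c and -W options) that Raspbian normally ships. It does not reproduce watchdog's own ping logic, it just sends the same number of echo requests over the same interface.

```shell
#!/bin/sh
# Hypothetical helper (not part of watchdog): send a few echo requests
# through one interface and report whether any of them was answered.
# Assumes iputils ping, which exits 0 when at least one reply arrives.
target_reachable() {
    target=$1
    iface=${2:-wlan0}   # same interface as in the configuration above
    count=${3:-5}       # same number of pings as ping-count
    ping -I "$iface" -c "$count" -W 2 "$target" >/dev/null 2>&1
}

# Example (run by hand before rebooting):
#   target_reachable 192.168.1.1 && echo "target answers" || echo "no reply"
```

If the example prints "no reply" while the network is known to be up, the ping line in /etc/watchdog.conf probably needs another look before the next reboot.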

192.168.1.1 is my default gateway, and I really do want to test connectivity against this address instead of some random host on the Internet, since I don’t want my Pi to reboot just because the Internet connection is down. If you insist on pinging a host on the Internet (not recommended), a good choice might be the Google public DNS servers (8.8.8.8 and 8.8.4.4) or any other host of your choice.

I did use a Google DNS server for testing purposes, since it was easier to cut the connection to the Internet but maintain local area network (LAN) connectivity for management purposes.
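
If you are not sure what your own default gateway is, it can be read from the routing table. A minimal sketch, assuming the iproute2 ip command is installed and prints default routes in its usual "default via ADDRESS dev IFACE" form; the function name default_gateway is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read `ip route` output on stdin and print the
# gateway address of the first default route, if there is one.
default_gateway() {
    awk '$1 == "default" && $2 == "via" { print $3; exit }'
}

# Example:
#   ip route show | default_gateway
```

The address it prints is a candidate for the ping line in /etc/watchdog.conf. Reading the routing table from stdin keeps the function easy to try out against saved output.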

Watchdog logs to syslog (/var/log/syslog), and when the ping test fails, this is what it will look like (note that the target here is Google DNS 8.8.8.8, not my internal network default gateway):

Oct 24 20:35:42 localhost watchdog[2640]: ping: 8.8.8.8
Oct 24 20:35:42 localhost watchdog[2640]: no response from ping (target: 8.8.8.8)
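
To get a rough idea of how often the ping test fails, the matching syslog lines can be counted. This is a sketch only: the function name count_ping_failures is made up, and the pattern simply assumes the log format shown above, which may differ between watchdog versions.

```shell
#!/bin/sh
# Hypothetical helper: count watchdog ping failures in a syslog-style file.
# The pattern matches the "no response from ping" lines shown above.
count_ping_failures() {
    logfile=${1:-/var/log/syslog}
    grep -c 'watchdog\[[0-9]*\]: no response from ping' "$logfile"
}

# Example:
#   count_ping_failures /var/log/syslog
```

Keep in mind that syslog is rotated, so the count only covers the period of the file you point it at.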

When there is no response to any of the five (5) ping echo requests, the Raspberry Pi will reboot. I will forget this, so I added the following to /home/pi/.profile:

echo "Warning: If network is down, this system will reboot in 20 seconds. Comment out ping = 192.168.1.1 line from /etc/watchdog.conf to avoid reboots."

That will print a warning message, plus a note on which file to edit if needed, every time I log in. Even I should now be able to remember where Watchdog is configured 🙂
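
A possible variation is to print the reminder only while a ping target is actually active, so that it disappears once the line is commented out. The sketch below is an assumption-laden example, not something the original setup uses: the function name watchdog_reminder is made up, and it assumes one "ping = ADDRESS" entry per line, optionally followed by a comment, as in the configuration above.

```shell
#!/bin/sh
# Hypothetical helper for ~/.profile: warn only while a ping target is
# active (not commented out) in the watchdog configuration.
watchdog_reminder() {
    conf=${1:-/etc/watchdog.conf}
    [ -r "$conf" ] || return 0
    # Take the value of each uncommented "ping = ..." line, dropping any
    # trailing "# comment" text. "ping-count" lines do not match.
    targets=$(sed -n 's/^[[:space:]]*ping[[:space:]]*=[[:space:]]*\([^#[:space:]]*\).*/\1/p' "$conf" | tr '\n' ' ')
    [ -n "$targets" ] || return 0
    echo "Warning: watchdog reboots this system if ${targets}stops answering ping. See $conf."
}

# Example line for the end of ~/.profile, after the function definition:
#   watchdog_reminder
```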

Further reading

Below is a list of resources that helped me with the solution:

  1. Watchdog man page: https://www.systutorials.com/docs/linux/man/8-watchdog/
  2. Watchdog.conf man page: https://www.systutorials.com/docs/linux/man/5-watchdog.conf/
  3. Good explanation of watchdog.conf ping variable: http://www.sat.dundee.ac.uk/psc/watchdog/watchdog-configure.html#Network_ping
  4. Reddit, a solution which almost worked: https://www.reddit.com/r/raspberry_pi/comments/4ih9xo/id_like_my_routeronastick_vpn_to_autorestart/d2y3yj4/?context=3
  5. Non-watchdog script to solve the same issue (not mine): https://gist.github.com/SandroMachado/87e591fc42f368636b251b566485ae46
  6. Another non-watchdog script (again, not mine): http://weworkweplay.com/play/rebooting-the-raspberry-pi-when-it-loses-wireless-connection-wifi/

Improvements

I have only had this setup running for a couple of hours, so it’s unclear if it will really work over time. I hope so, and I’ll update this article if needed.

What else could I do with Watchdog? Well, I could certainly extend this to check that my Python programs do not die, or at least to react and restart them automatically when they do. It could also react if the system bogs down and stays unresponsive for extended periods of time.
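
One way this might look: watchdog.conf has a test-binary option that runs an executable of your choice and treats a nonzero exit code as a failed check (there is also a pidfile option that performs a similar check natively). The sketch below is a hypothetical check, not something I have running. The function name process_alive and the pid file path /run/sensors.pid are made up for illustration, and it assumes the Python program writes its own PID to that file. As far as I understand the man page, a failed test leads to a reboot unless a repair binary is configured, so this would reboot the Pi rather than restart only the program.

```shell
#!/bin/sh
# Hypothetical check: succeed only if the process whose PID is stored
# in the given pid file is still alive.
process_alive() {
    pidfile=$1
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    [ -n "$pid" ] || return 1
    # kill -0 sends no signal; it only tests whether the PID exists
    # (and whether we are allowed to signal it).
    kill -0 "$pid" 2>/dev/null
}

# Example use at the end of a standalone check script (path is an assumption):
#   process_alive /run/sensors.pid || exit 1
```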

What do you think? If you have suggestions, improvements or indeed more experience than I do, please do leave a comment below.

Follow-up

Update 30.3.2018: Reminded by Will in the comments (thanks!), I noticed I have not followed up on my promise to update this article.

I have not had any problems with performance, reboot loops or the like with this setup. It just works as intended. The 20 second interval is a bit tight, but it’s entirely doable to stop watchdog in that time if needed.

How would I improve this? Well, I’d create a separate reboot log instead of relying on syslog. It would help with collecting statistics over a longer time period but, as I said, it’s really not necessary.
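
A separate log could be as simple as one line appended at every boot. The following is a minimal sketch under a few assumptions: the function name log_boot and the paths /home/pi/boot.log and /home/pi/log_boot.sh are made up, and cron is assumed to support @reboot entries (the cron shipped with Raspbian does). It records every boot, not only those triggered by watchdog, and since the Pi has no real-time clock the timestamp may be off until the time has been synced over the network.

```shell
#!/bin/sh
# Hypothetical helper: append one timestamped line per boot to a log file.
# It records every boot, not only those triggered by watchdog.
log_boot() {
    logfile=${1:-/home/pi/boot.log}
    printf '%s boot\n' "$(date '+%Y-%m-%d %H:%M:%S')" >> "$logfile"
}

# Example crontab entry (crontab -e), assuming this file is saved as
# /home/pi/log_boot.sh and ends with a call to log_boot:
#   @reboot /bin/sh /home/pi/log_boot.sh
```

Counting the lines in the log file then gives the number of boots over whatever period the file covers.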

7 thoughts on “Watchdog: reboot Raspberry Pi if network is down”

  1. Hi Will, thanks for your comment!

    It seems I have forgotten my promise about updating the article – shame on me.

    Watchdog has been working perfectly. I can see that it occasionally does reboot the Pi, and I have not even once had problems with the script as of yet. I have been updating the Pi via apt-get normally, and no conflicts have emerged. The configuration in the above article has not been changed so it’s still valid.

    I’m very happy with the solution.

  2. Thank you. I have found your writeup to be easy compared to others I have seen. I even spent 20 min today trying to find this exact page to replicate on another pi. You have been bookmarked!

    Thanks again for an easy to follow tutorial, and I think the echo reminder is a great idea!

  3. Hi, I am struggling with the same problem. I read your small script, but can you please tell me where the reboot command is, or where it should be written? I didn’t completely understand it.

  4. I’m curious to know if watchdog adds any unwanted overhead, as a simple bash script running in cron would do the same thing. And with that in mind, the script could simply reconnect the Wi-Fi without the need for a reboot. I found this tutorial while searching for ideas for connection issues I’m having with a Pi Zero W, and actually considered watchdog, but a bash script might be easier considering the time it takes my Pi Zero to reboot. Thoughts?

    • Hi, Thank you for the comment!

      Could you give an example of the simple bash script in cron? I’ve had problems keeping these up and running. I’d be interested in your solution to a similar problem.

      My use case does not need to worry about the (arguably) short time the reboot takes, nor do I have any problems with the possible watchdog overhead.

      I’m no specialist by any means, so I cannot really answer – this is my experience and at this time (12/2018) the watchdog has been working perfectly.
